S-clause segmentation for efficient syntactic analysis using decision trees

Authors

  • Mi-Young Kim
  • Jong-Hyeok Lee
Abstract

In dependency parsing of long sentences with fewer subjects than predicates, it is difficult to recognize which predicate governs which subject. To handle such syntactic ambiguity between subjects and predicates, this paper proposes an “S-clause” segmentation method, where an S(ubject)-clause is defined as a group of words containing several predicates and their common subject. The segmentation is performed automatically using decision trees. S-clause information proved very effective in analyzing long sentences, improving performance by 5 percent.

Similar resources

An Efficient Syntactic Analyzer Using Semantic Connection Units

We implement a Korean syntactic analyzer which decreases many ambiguities in syntax parse trees using segmentation and semantic connection units. We use dependency grammar for parsing. Our syntactic analysis system generates all parse trees of a given sentence. So, the number of parse trees of syntactic analysis is many. To decrease the number of parse trees, we suggested semantic connection un...

Robust N-gram Based Syntactic Analysis Using Segmentation Words

We describe an N-gram based syntactic analysis using a dependency grammar. Instead of generalizing syntactic rules, N-gram information of parts of speech is used to segment a sequence of words into two clauses. A special part of speech, called segmentation word, which corresponds to the beginning or end symbol of clauses is introduced to express a sentence structure. Segmentation words for each...

Deterministic natural language generation from meaning representations for machine translation

This paper describes a deterministic method for generating natural language suited to being part of a machine translation system with meaning representations as the level for language transfer. Starting from Davidsonian/Penman meaning representations, syntactic trees are built following the Penn Parsed Corpus of Modern British English, from which the yield (i.e., the words) can be taken. The no...

Functional FX-bar Projections for Local and Global Text Structures. The Anatomy of Predication

This paper proposes and discusses issues on local and global text structures, all of them being connected to a lexical concept of predication. The main contributions of the present work comprise: (a) A novel functional X-bar (FX-bar) scheme is advised, aiming to reveal, model and relate the local, clause-level markers and text structures. (b) At global level, two FX-bar schemes are proposed, on...

Non-Dictionary-Based Thai Word Segmentation Using Decision Trees

For languages without word boundary delimiters, dictionaries are needed for segmenting running texts. This figure makes segmentation accuracy depend significantly on the quality of the dictionary used for analysis. If the dictionary is not sufficiently good, it will lead to a great number of unknown or unrecognized words. These unrecognized words certainly reduce segmentation accuracy. To solve...

Publication date: 2003